210 research outputs found

    A study of alternative splicing in the pig

    Background: Since at least half of the genes in mammalian genomes are subject to alternative splicing, alternative pre-mRNA splicing makes an important contribution to the complexity of the mammalian proteome. Expressed sequence tags (ESTs) provide evidence of a great number of possible alternative isoforms. With the EST resource for the domestic pig now containing more than one million porcine ESTs, it is possible to identify alternative splice forms of individual transcripts in this species from the EST data with some confidence. Results: The pig EST data generated by the Sino-Danish Pig Genome project have been assembled with publicly available ESTs and made available in the PigEST database. Using the Distiller package, 2,515 EST clusters with candidate alternative isoforms were identified in the EST data with high confidence. In agreement with general observations in human and mouse, we find putative splice variants in about 30% of the contigs with more than 50 ESTs. Applying the criterion that at least two EST sequences confirm each splice event, a list of 100 genes with the most distinct tissue-specific alternative splice events was generated from the list of candidates. To confirm the tissue specificity of the splice events, 10 genes with functional annotation were randomly selected, from which 16 individual splice events were chosen for experimental verification by quantitative PCR (qPCR). Six genes were shown to have tissue-specific alternatively spliced transcripts with expression patterns matching those of the EST data. The remaining four genes had tissue-restricted expression of alternatively spliced transcripts. Five of the 16 experimentally verified splice events were found to be putatively pig-specific.
    Conclusions: In accordance with human and rodent studies, we estimate that approximately 30% of porcine genes undergo alternative splicing. We found a good correlation between EST-predicted tissue specificity and experimentally validated splice events in different porcine tissues. This study indicates that a cluster size of around 50 ESTs is optimal for in silico detection of alternative splicing. Although based on a limited number of splice events, the study supports the notion that alternative splicing could have an important impact on species differentiation, since 31% of the splice events studied (5 of 16) appear to be species-specific.
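The support criterion mentioned in this abstract (at least two EST sequences confirming each splice event) can be illustrated with a minimal sketch. This is not the Distiller package or the authors' pipeline; the `(est_id, event_id)` input format and the event labels are hypothetical and chosen only for illustration.

```python
from collections import Counter

MIN_SUPPORTING_ESTS = 2  # threshold stated in the abstract


def confirmed_events(est_observations, min_support=MIN_SUPPORTING_ESTS):
    """Return splice events observed in at least `min_support` distinct ESTs.

    est_observations: iterable of (est_id, event_id) pairs (hypothetical format).
    """
    support = Counter()
    seen = set()
    for est_id, event_id in est_observations:
        if (est_id, event_id) not in seen:  # count each EST once per event
            seen.add((est_id, event_id))
            support[event_id] += 1
    return {event for event, n in support.items() if n >= min_support}


observations = [
    ("est1", "geneA:skip_exon3"),
    ("est2", "geneA:skip_exon3"),
    ("est2", "geneA:skip_exon3"),  # duplicate observation, ignored
    ("est3", "geneB:alt_3ss"),     # only one supporting EST
]
print(confirmed_events(observations))
```

Counting distinct ESTs rather than raw observations keeps a single repeated sequence from satisfying the threshold on its own.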

    Extraction, integration and analysis of alternative splicing and protein structure distributed information

    Background: Alternative splicing has been demonstrated to affect most human genes; different isoforms from the same gene encode proteins that differ in a limited number of residues, thus yielding similar structures. This suggests possible correlations between alternative splicing and protein structure. To support the investigation of such relationships, we have developed the Alternative Splicing and Protein Structure Scrutinizer (PASS), a Web application that automatically extracts, integrates and analyzes human alternative splicing and protein structure data dispersed across the Alternative Splicing Database, the Ensembl databank and the Protein Data Bank. Primary data from these databases have been integrated and analyzed using the Protein Identifier Cross-Reference, BLAST, CLUSTALW and FeatureMap3D software tools. Results: A database has been developed to store the primary data and the results of their analysis; a system of Perl scripts has been implemented to automatically create and update the database and analyze the integrated data; a Web interface has been implemented to make the analyses easily accessible; a second database has been created to manage user access to the PASS Web application and store users' data and searches. Conclusion: PASS automatically integrates data from the Alternative Splicing Database with protein structure data from the Protein Data Bank. Additionally, it comprehensively analyzes the integrated data with publicly available, well-known bioinformatics tools in order to generate structural information on isoform pairs. Further analysis of this information might reveal interesting relationships between alternative splicing and protein structure differences, which may be significantly associated with different functions.

    Large introns in relation to alternative splicing and gene evolution: a case study of Drosophila bruno-3

    Background: Alternative splicing (AS) of maturing mRNA can generate structurally and functionally distinct transcripts from the same gene. Recent bioinformatic analyses of available genome databases inferred a positive correlation between intron length and AS. To study the interplay between intron length and AS empirically and in more detail, we analyzed the diversity of alternatively spliced transcripts (ASTs) in the Drosophila RNA-binding Bruno-3 (Bru-3) gene. This gene was known to encode thirteen exons separated by introns of diverse sizes, ranging from 71 to 41,973 nucleotides in D. melanogaster. Although Bru-3's structure is expected to be conducive to AS, only two ASTs of this gene were previously described. Results: Cloning of RT-PCR products of the entire ORF from four species representing three diverged Drosophila lineages provided an evolutionary perspective, high sensitivity, and long-range contiguity of splice choices currently unattainable by high-throughput methods. Consequently, we identified three new exons, a new exon fragment and thirty-three previously unknown ASTs of Bru-3. All exon-skipping events in the gene were mapped to the exons surrounded by introns of at least 800 nucleotides, whereas exons split by introns of less than 250 nucleotides were always spliced contiguously in mRNA. Cases of exon loss and creation during Bru-3 evolution in Drosophila were also localized within large introns. Notably, we identified a true de novo exon gain: exon 8 was created along the lineage of the obscura group from intronic sequence between cryptic splice sites conserved among all Drosophila species surveyed. Exon 8 was included in mature mRNA by the species representing all the major branches of the obscura group. To our knowledge, the origin of exon 8 is the first documented case of exonization of intronic sequence outside vertebrates. 
Conclusion: We found that large introns can promote AS via exon-skipping and exon turnover during evolution likely due to frequent errors in their removal from maturing mRNA. Large introns could be a reservoir of genetic diversity, because they have a greater number of mutable sites than short introns. Taken together, gene structure can constrain and/or promote gene evolution
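The two intron-length observations reported for Bru-3 (exon skipping only where flanking introns are at least 800 nucleotides; contiguous splicing across introns shorter than 250 nucleotides) can be written down as a small classification sketch. The function name, the labels, and the reading that both flanking introns must be large are assumptions made for illustration; the thresholds describe what was observed in one gene and are not a general predictor.

```python
LARGE_INTRON_NT = 800  # skipped exons were flanked by introns of at least this size
SHORT_INTRON_NT = 250  # exons across shorter introns were always spliced contiguously


def classify_exon(upstream_intron_nt, downstream_intron_nt):
    """Label an internal exon by the lengths of its two flanking introns.

    Hypothetical labels summarizing the Bru-3 observations:
    - "contiguous": at least one flanking intron is shorter than 250 nt
    - "skipping-compatible": both flanking introns are at least 800 nt
    - "uncharacterized": the abstract makes no statement about this case
    """
    if min(upstream_intron_nt, downstream_intron_nt) < SHORT_INTRON_NT:
        return "contiguous"
    if min(upstream_intron_nt, downstream_intron_nt) >= LARGE_INTRON_NT:
        return "skipping-compatible"
    return "uncharacterized"


# 71 and 41,973 nt are the extreme intron sizes quoted for D. melanogaster Bru-3.
print(classify_exon(71, 41973))
print(classify_exon(800, 41973))
print(classify_exon(500, 900))
```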

    Geographic variation and factors associated with female genital mutilation among reproductive age women in Ethiopia: A national population based survey

    Background: Female genital mutilation (FGM) is a common traditional practice in developing nations including Ethiopia. It poses complex and serious long-term health risks for women and girls and can lead to death. In Ethiopia, the geographic distribution and factors associated with FGM practices are poorly understood. Therefore, we assessed the spatial distribution and factors associated with FGM among reproductive age women in the country. Method: We used population-based, nationally representative surveys. Data from two Ethiopian demographic and health surveys (EDHS 2000 and EDHS 2005) were used in this analysis. Briefly, EDHS used a stratified, two-stage cluster sampling design. A total of 15,367 (from EDHS 2000) and 14,070 (from EDHS 2005) women of reproductive age (15-49 years) were included in the analysis. Three outcome variables were used (prevalence of FGM among women, prevalence of FGM among daughters and support for the continuation of FGM). The data were weighted, and descriptive statistics (percentage change) and bivariate and multivariable logistic regression analyses were carried out. Multicollinearity of variables was assessed using variance inflation factors (VIF) with a reference value of 10 before interpreting the final output. The geographic variation and clustering of weighted FGM prevalence were analyzed and visualized on maps using ArcGIS. Z-scores were used to assess the statistical significance of geographic clustering of FGM prevalence spots. Result: The weighted prevalence of FGM has been decreasing. Being wealthy, Muslim and in higher age categories are associated with increased odds of FGM among women. Similarly, daughters of Muslim women have increased odds of experiencing FGM. Women in the higher age categories have increased odds of having daughters who experience FGM. The odds of FGM among daughters decrease with increased maternal education.
    Mass media exposure, being wealthy and higher paternal and maternal education are associated with decreased odds of women's support of FGM continuation. FGM prevalence and geographic clustering showed variation across regions in Ethiopia. Conclusion: Individual, economic, socio-demographic, religious and cultural factors played major roles in the existing practice and continuation of FGM. Significant geographic clustering of FGM was observed across regions in Ethiopia. Therefore, targeted and integrated interventions involving religious leaders in high FGM prevalence spot clusters and addressing the socio-economic and geographic inequalities are recommended to eliminate FGM. © 2016 Setegn et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
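The multicollinearity check described in this abstract (variance inflation factors compared with a reference value of 10) can be sketched for the simplest case. In general, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on all other predictors; with exactly two predictors, R_j^2 is the squared Pearson correlation between them. The code below covers only that two-predictor special case, and the data are made up for illustration; it is not the survey analysis itself.

```python
import math

VIF_REFERENCE = 10  # reference value quoted in the abstract


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Made-up predictor values (assumption, for illustration only).
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif = vif_two_predictors(x1, x2)
print(f"VIF = {vif:.2f}, below reference: {vif < VIF_REFERENCE}")
```

With more than two predictors, each R_j^2 would need a full auxiliary regression, which a statistics library would normally provide.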

    A phylogenetic generalized hidden Markov model for predicting alternatively spliced exons

    BACKGROUND: An important challenge in eukaryotic gene prediction is accurate identification of alternatively spliced exons. Functional transcripts can go undetected in gene expression studies when alternative splicing only occurs under specific biological conditions. Non-expression based computational methods support identification of rarely expressed transcripts. RESULTS: A non-expression based statistical method is presented to annotate alternatively spliced exons using a single genome sequence and evidence from cross-species sequence conservation. The computational method is implemented in the program ExAlt and an analysis of prediction accuracy is given for Drosophila melanogaster. CONCLUSION: ExAlt identifies the structure of most alternatively spliced exons in the test set and cross-species sequence conservation is shown to improve the precision of predictions. The software package is available to run on Drosophila genomes to search for new cases of alternative splicing

    Allelic Gene Structure Variations in Anopheles gambiae Mosquitoes

    Allelic gene structure variations and alternative splicing are responsible for transcript structure variations. More than 75% of human genes have structural isoforms of transcripts, but to date few studies have been conducted to verify the alternative splicing systematically. The present study used expressed sequence tags (ESTs) and EST-tagged SNP patterns to examine the transcript structure variations resulting from allelic gene structure variations in the major human malaria vector, Anopheles gambiae. About 80% of 236,004 available A. gambiae ESTs were successfully aligned to A. gambiae reference genomes. More than 2,340 transcript structure variation events were detected. Because the current A. gambiae annotation is incomplete, we re-annotated the A. gambiae genome with an A. gambiae-specific gene model so that the effect of variations on gene coding could be better evaluated. A total of 15,962 genes were predicted. Among them, 3,873 were novel genes and 12,089 were previously identified genes. The gene completion rate improved from 60% to 84%. Based on EST support, 82.5% of gene structures were predicted correctly. In light of the new annotation, we found that approximately 78% of transcript structure variations were located within the coding sequence (CDS) regions, and >65% of variations in the CDS regions preserve the open reading frame. The association between transcript structure isoforms and SNPs indicated that more than 28% of transcript structure variation events were contributed by different gene alleles in A. gambiae. We successfully expanded the A. gambiae genome annotation. We predicted and analyzed transcript structure variations in A. gambiae and found that allelic gene structure variation plays a major role in transcript diversity in this important human malaria vector

    SAW: A Method to Identify Splicing Events from RNA-Seq Data Based on Splicing Fingerprints

    Splicing event identification is one of the most important issues in the comprehensive analysis of transcription profiles. Recent development of next-generation sequencing technology has generated extensive profiles of alternative splicing. However, while many of the splicing events studied are between exons that are relatively close on genome sequences, reads generated by RNA-Seq are not limited to alternative splicing between close exons but occur in virtually all splicing events. In this work, a novel method, SAW, is proposed for the identification of all splicing events based on short reads from RNA-Seq. It was observed that short reads not in known gene models are actually absent words from known gene sequences. An efficient method was developed to filter and cluster these short reads by fingerprint fragments of splicing events without aligning the short reads to genome sequences. Additionally, the possible splicing sites were determined without alignment against genome sequences. A consensus sequence was then generated for each short read cluster and aligned to the genome sequences. Results demonstrated that this method could identify more than 90% of the known splicing events with a very low false discovery rate, as well as accurately identify a number of novel splicing events between distant exons
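The "absent words" observation in this abstract can be illustrated with a k-mer sketch: a read that contains at least one word of length k not found in any known gene sequence cannot come entirely from a known gene model, so it is a candidate for carrying a novel splice junction. This is not the SAW implementation; the function names, the toy sequences and the choice of k are assumptions, and reverse complements and sequencing errors are ignored.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (words) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reads_with_absent_words(reads, known_sequences, k):
    """Keep reads containing at least one k-mer absent from all known sequences."""
    known = set()
    for seq in known_sequences:
        known |= kmers(seq, k)
    # A non-empty set difference means the read holds a word never seen
    # in the known sequences.
    return [read for read in reads if kmers(read, k) - known]


known_transcripts = ["ACGTACGTGG"]  # toy "known gene sequence"
reads = ["ACGTAC", "ACGTTT"]        # the second read contains the unseen word CGTT
print(reads_with_absent_words(reads, known_transcripts, k=4))
```

Because the filter is a set lookup rather than an alignment, it scales with the number of k-mers, which matches the abstract's emphasis on avoiding alignment of every short read to the genome.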

    Width of Gene Expression Profile Drives Alternative Splicing

    Alternative splicing generates an enormous amount of functional and proteomic diversity in metazoan organisms. This process is probably central to the macromolecular and cellular complexity of higher eukaryotes. While most studies have focused on the molecular mechanisms triggering and controlling alternative splicing, as well as on its incidence in different species, its maintenance and evolution within populations have been little investigated. Here, we propose to address these questions by comparing the structural characteristics as well as the functional and transcriptional profiles of genes with monomorphic or polymorphic splicing, referred to as MS and PS genes, respectively. We find that MS and PS genes differ particularly in the number of tissues and cell types where they are expressed. We find a striking deficit of PS genes on the sex chromosomes, particularly on the Y chromosome, where it is shown not to be due to the observed lower breadth of expression of genes on that chromosome. The development of a simple model of evolution of cis-regulated alternative splicing leads to predictions in agreement with these observations. It further predicts the conditions for the emergence and the maintenance of cis-regulated alternative splicing, which are both favored by the tissue-specific expression of splicing variants. We finally propose that the width of the gene expression profile is an essential factor for the acquisition of new transcript isoforms that could later be maintained by a new form of balancing selection